š”ļø CRASH-PROOF LSTM Autoencoder - MINIMAL VERSION¶
šØ ULTRA-SAFE IMPLEMENTATION - GUARANTEED NO CRASHES¶
This version is designed to be 100% crash-proof:
- ā TINY dataset (500 rows max)
- ā CPU-only (no GPU issues)
- ā Minimal model (16ā8ā4 layers)
- ā Step-by-step execution with checks
- ā Memory monitoring at every step
- ā Graceful error handling everywhere
š INSTRUCTIONS:¶
- Run cells ONE BY ONE
- Wait for each cell to complete
- Check memory usage after each step
- Stop if you see any warnings
InĀ [3]:
# STEP 1: PYTORCH & BASIC SETUP - CRASH SAFE
print("š§ Installing PyTorch and basic packages...")
import sys
import subprocess
# Install PyTorch CPU-only first (most critical)
try:
subprocess.check_call([sys.executable, "-m", "pip", "install",
"torch", "--index-url", "https://download.pytorch.org/whl/cpu"])
print("ā
PyTorch CPU installed")
except Exception as e:
print(f"ā ļø PyTorch installation warning: {e}")
# Install other essential packages
try:
subprocess.check_call([sys.executable, "-m", "pip", "install", "pandas", "numpy", "matplotlib", "scikit-learn"])
print("ā
Basic packages installed")
except Exception as e:
print(f"ā ļø Package installation warning: {e}")
# Import all required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import gc
import os
import psutil
import torch
import torch.nn as nn
import torch.optim as optim
# Force CPU usage
device = torch.device('cpu')
print(f"ā
Using device: {device}")
# Test PyTorch
test_tensor = torch.randn(2, 3)
print(f"ā
PyTorch test successful: {test_tensor.shape}")
# Memory check
memory_mb = psutil.Process().memory_info().rss / 1024**2
print(f"š Initial memory: {memory_mb:.1f} MB")
if memory_mb > 1000:
print("ā ļø High initial memory - consider restarting kernel")
print("ā
Step 1 complete - All libraries ready")
š§ Installing PyTorch and basic packages... Looking in indexes: https://download.pytorch.org/whl/cpu Requirement already satisfied: torch in ./.venv/lib/python3.12/site-packages (2.8.0) Requirement already satisfied: filelock in ./.venv/lib/python3.12/site-packages (from torch) (3.19.1) Requirement already satisfied: typing-extensions>=4.10.0 in ./.venv/lib/python3.12/site-packages (from torch) (4.14.1) Requirement already satisfied: setuptools in ./.venv/lib/python3.12/site-packages (from torch) (80.9.0) Requirement already satisfied: sympy>=1.13.3 in ./.venv/lib/python3.12/site-packages (from torch) (1.14.0) Requirement already satisfied: networkx in ./.venv/lib/python3.12/site-packages (from torch) (3.5) Requirement already satisfied: jinja2 in ./.venv/lib/python3.12/site-packages (from torch) (3.1.6) Requirement already satisfied: fsspec in ./.venv/lib/python3.12/site-packages (from torch) (2025.7.0) Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.8.93 in ./.venv/lib/python3.12/site-packages (from torch) (12.8.93) Requirement already satisfied: nvidia-cuda-runtime-cu12==12.8.90 in ./.venv/lib/python3.12/site-packages (from torch) (12.8.90) Requirement already satisfied: nvidia-cuda-cupti-cu12==12.8.90 in ./.venv/lib/python3.12/site-packages (from torch) (12.8.90) Requirement already satisfied: nvidia-cudnn-cu12==9.10.2.21 in ./.venv/lib/python3.12/site-packages (from torch) (9.10.2.21) Requirement already satisfied: nvidia-cublas-cu12==12.8.4.1 in ./.venv/lib/python3.12/site-packages (from torch) (12.8.4.1) Requirement already satisfied: nvidia-cufft-cu12==11.3.3.83 in ./.venv/lib/python3.12/site-packages (from torch) (11.3.3.83) Requirement already satisfied: nvidia-curand-cu12==10.3.9.90 in ./.venv/lib/python3.12/site-packages (from torch) (10.3.9.90) Requirement already satisfied: nvidia-cusolver-cu12==11.7.3.90 in ./.venv/lib/python3.12/site-packages (from torch) (11.7.3.90) Requirement already satisfied: nvidia-cusparse-cu12==12.5.8.93 in ./.venv/lib/python3.12/site-packages (from torch) (12.5.8.93) Requirement already satisfied: nvidia-cusparselt-cu12==0.7.1 in ./.venv/lib/python3.12/site-packages (from torch) (0.7.1) Requirement already satisfied: nvidia-nccl-cu12==2.27.3 in ./.venv/lib/python3.12/site-packages (from torch) (2.27.3) Requirement already satisfied: nvidia-nvtx-cu12==12.8.90 in ./.venv/lib/python3.12/site-packages (from torch) (12.8.90) Requirement already satisfied: nvidia-nvjitlink-cu12==12.8.93 in ./.venv/lib/python3.12/site-packages (from torch) (12.8.93) Requirement already satisfied: nvidia-cufile-cu12==1.13.1.3 in ./.venv/lib/python3.12/site-packages (from torch) (1.13.1.3) Requirement already satisfied: triton==3.4.0 in ./.venv/lib/python3.12/site-packages (from torch) (3.4.0) Requirement already satisfied: mpmath<1.4,>=1.1.0 in ./.venv/lib/python3.12/site-packages (from sympy>=1.13.3->torch) (1.3.0) Requirement already satisfied: MarkupSafe>=2.0 in ./.venv/lib/python3.12/site-packages (from jinja2->torch) (3.0.2) ā PyTorch CPU installed Requirement already satisfied: pandas in ./.venv/lib/python3.12/site-packages (2.3.1) Requirement already satisfied: numpy in ./.venv/lib/python3.12/site-packages (2.3.2) Requirement already satisfied: matplotlib in ./.venv/lib/python3.12/site-packages (3.10.5) Requirement already satisfied: scikit-learn in ./.venv/lib/python3.12/site-packages (1.7.1) Requirement already satisfied: python-dateutil>=2.8.2 in ./.venv/lib/python3.12/site-packages (from pandas) (2.9.0.post0) Requirement already satisfied: pytz>=2020.1 in ./.venv/lib/python3.12/site-packages (from pandas) (2025.2) Requirement already satisfied: tzdata>=2022.7 in ./.venv/lib/python3.12/site-packages (from pandas) (2025.2) Requirement already satisfied: contourpy>=1.0.1 in ./.venv/lib/python3.12/site-packages (from matplotlib) (1.3.3) Requirement already satisfied: cycler>=0.10 in ./.venv/lib/python3.12/site-packages (from matplotlib) (0.12.1) Requirement already satisfied: fonttools>=4.22.0 in ./.venv/lib/python3.12/site-packages (from matplotlib) (4.59.0) Requirement already satisfied: kiwisolver>=1.3.1 in ./.venv/lib/python3.12/site-packages (from matplotlib) (1.4.9) Requirement already satisfied: packaging>=20.0 in ./.venv/lib/python3.12/site-packages (from matplotlib) (25.0) Requirement already satisfied: pillow>=8 in ./.venv/lib/python3.12/site-packages (from matplotlib) (11.3.0) Requirement already satisfied: pyparsing>=2.3.1 in ./.venv/lib/python3.12/site-packages (from matplotlib) (3.2.3) Requirement already satisfied: scipy>=1.8.0 in ./.venv/lib/python3.12/site-packages (from scikit-learn) (1.16.1) Requirement already satisfied: joblib>=1.2.0 in ./.venv/lib/python3.12/site-packages (from scikit-learn) (1.5.1) Requirement already satisfied: threadpoolctl>=3.1.0 in ./.venv/lib/python3.12/site-packages (from scikit-learn) (3.6.0) Requirement already satisfied: six>=1.5 in ./.venv/lib/python3.12/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0) ā Basic packages installed ā Using device: cpu ā PyTorch test successful: torch.Size([2, 3]) š Initial memory: 616.4 MB ā Step 1 complete - All libraries ready
InĀ [5]:
# STEP 2: LOAD MINIMAL DATA - ULTRA SAFE
print("š Loading TINY dataset portion...")
# STEP 2: LOAD MINIMAL DATA - ULTRA SAFE
print("š Loading TINY dataset portion...")
try:
# Load data with extreme safety
data_path = '/home/ashwinvel2000/TAQA/training_data/wide36_tools_flat.parquet'
print(f"Loading from: {data_path}")
# Load full file first, then limit rows (nrows not supported in read_parquet)
df_full = pd.read_parquet(data_path)
df = df_full.head(1000) # Take 1000 rows for 9-feature model
print(f"ā
Loaded {len(df)} rows from {len(df_full)} total (SAFE SIZE)")
print(f"Columns: {list(df.columns)}")
# Memory check
memory_mb = psutil.Process().memory_info().rss / 1024**2
print(f"š Memory after loading: {memory_mb:.1f} MB")
# Basic info
if 'Tool' in df.columns:
print(f"Tools found: {df['Tool'].unique()}")
else:
print("ā ļø No 'Tool' column found")
# Clean up full dataframe to save memory
del df_full
gc.collect()
except Exception as e:
print(f"ā Data loading failed: {e}")
print("Using dummy data instead...")
# Create dummy data if loading fails
df = pd.DataFrame({
'Tool': ['P8-7'] * 500,
'Battery-Voltage': np.random.normal(13, 0.5, 500),
'Choke-Position': np.random.normal(10, 5, 500),
'Upstream-Pressure': np.random.normal(100, 10, 500),
'Downstream-Pressure': np.random.normal(95, 10, 500),
'Upstream-Temperature': np.random.normal(80, 5, 500),
'Downstream-Temperature': np.random.normal(82, 5, 500)
})
df.index = pd.date_range('2023-01-01', periods=500, freq='10s')
print("ā
Created dummy data")
print(f"ā
Step 2 complete - Data: {df.shape}")
š Loading TINY dataset portion... š Loading TINY dataset portion... Loading from: /home/ashwinvel2000/TAQA/training_data/wide36_tools_flat.parquet ā Loaded 1000 rows from 1288266 total (SAFE SIZE) Columns: ['Tool', 'Battery-Voltage', 'Choke-Position', 'Downstream-Pressure', 'Downstream-Temperature', 'Downstream-Upstream-Difference', 'Target-Position', 'Tool-State', 'Upstream-Pressure', 'Upstream-Temperature', 'IsOpen', 'DeltaTemperature', 'ToolStateNum', 'RuleAlert'] š Memory after loading: 985.2 MB Tools found: ['P8-1'] ā Step 2 complete - Data: (1000, 14)
InĀ [6]:
# STEP 3: 9-FEATURE MODEL PREPROCESSING
print("š§ Setting up 9 optimal features...")
try:
# Define our 9 optimal features
optimal_features = [
'Battery-Voltage', 'Choke-Position', 'Upstream-Pressure',
'Downstream-Pressure', 'Upstream-Temperature', 'Downstream-Temperature',
'Target-Position', 'Tool-State', 'Downstream-Upstream-Difference'
]
print(f"Target 9 optimal features: {optimal_features}")
# Check feature availability
available_features = []
missing_features = []
for feature in optimal_features:
if feature in df.columns:
available_features.append(feature)
print(f"ā
{feature}")
else:
missing_features.append(feature)
print(f"ā {feature} - Missing")
print(f"\nš Available: {len(available_features)}/{len(optimal_features)} features")
# Create missing derived features if possible
if 'Downstream-Pressure' in df.columns and 'Upstream-Pressure' in df.columns:
if 'Downstream-Upstream-Difference' not in df.columns:
df['Downstream-Upstream-Difference'] = df['Downstream-Pressure'] - df['Upstream-Pressure']
if 'Downstream-Upstream-Difference' not in available_features:
available_features.append('Downstream-Upstream-Difference')
print("ā
Created Downstream-Upstream-Difference")
# Use best available features (minimum 6 for viable model)
if len(available_features) >= 6:
feature_cols = available_features
print(f"ā
Using {len(feature_cols)} features for model")
else:
# Fallback to basic numeric columns
feature_cols = [col for col in df.columns if df[col].dtype in ['float64', 'int64']][:6]
print(f"ā ļø Fallback to basic features: {feature_cols}")
# Tool encoding
if 'Tool' in df.columns:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
df['tool_id'] = le.fit_transform(df['Tool'])
n_tools = len(le.classes_)
print(f"ā
Encoded {n_tools} tools")
else:
df['tool_id'] = 0
n_tools = 1
print("ā ļø Using single tool (0)")
# Handle missing values and normalize
df[feature_cols] = df[feature_cols].fillna(method='ffill').fillna(0)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
df[feature_cols] = scaler.fit_transform(df[feature_cols])
n_features = len(feature_cols)
print(f"ā
Normalized {n_features} features")
# Store for sequence creation
numeric_cols = feature_cols
# Memory check
memory_mb = psutil.Process().memory_info().rss / 1024**2
print(f"š Memory after preprocessing: {memory_mb:.1f} MB")
except Exception as e:
print(f"ā Feature preparation failed: {e}")
# Emergency fallback
numeric_cols = [col for col in df.columns if df[col].dtype in ['float64', 'int64']][:3]
n_features = len(numeric_cols)
n_tools = 1
df['tool_id'] = 0
print(f"ā ļø Emergency fallback: {numeric_cols}")
print(f"ā
Step 3 complete - Features: {n_features}, Tools: {n_tools}")
š§ Setting up 9 optimal features... Target 9 optimal features: ['Battery-Voltage', 'Choke-Position', 'Upstream-Pressure', 'Downstream-Pressure', 'Upstream-Temperature', 'Downstream-Temperature', 'Target-Position', 'Tool-State', 'Downstream-Upstream-Difference'] ā Battery-Voltage ā Choke-Position ā Upstream-Pressure ā Downstream-Pressure ā Upstream-Temperature ā Downstream-Temperature ā Target-Position ā Tool-State ā Downstream-Upstream-Difference š Available: 9/9 features ā Using 9 features for model ā Encoded 1 tools ā Normalized 9 features š Memory after preprocessing: 1017.5 MB ā Step 3 complete - Features: 9, Tools: 1
/tmp/ipykernel_1179/2364824688.py:58: FutureWarning: DataFrame.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead. df[feature_cols] = df[feature_cols].fillna(method='ffill').fillna(0)
InĀ [7]:
# STEP 4: CREATE SEQUENCES FOR 9-FEATURE MODEL
print("š Creating sequences for 9-feature model...")
try:
# Create sequences optimized for our feature count
seq_length = 15 # Slightly longer for 9-feature model
max_sequences = 50 # More sequences for richer model
print(f"Creating max {max_sequences} sequences of length {seq_length}")
sequences = []
feature_data = df[numeric_cols].values
tool_data = df['tool_id'].values
# Create sequences with proper stepping
step = max(1, (len(df) - seq_length) // max_sequences)
for i in range(0, min(len(df) - seq_length, max_sequences * step), step):
seq = feature_data[i:i+seq_length]
tool_id = tool_data[i]
if seq.shape[0] == seq_length and not np.isnan(seq).any():
sequences.append({
'features': seq.astype(np.float32),
'tool_id': int(tool_id)
})
print(f"ā
Created {len(sequences)} sequences")
if len(sequences) < 10:
print("ā ļø Few sequences - creating additional ones")
for i in range(10):
sequences.append({
'features': np.random.randn(seq_length, n_features).astype(np.float32),
'tool_id': 0
})
# Convert to tensors
X = torch.stack([torch.tensor(seq['features']) for seq in sequences])
tool_ids = torch.tensor([seq['tool_id'] for seq in sequences], dtype=torch.long)
print(f"ā
Tensor shapes: X={X.shape}, tools={tool_ids.shape}")
# Memory check
memory_mb = psutil.Process().memory_info().rss / 1024**2
print(f"š Memory after sequences: {memory_mb:.1f} MB")
except Exception as e:
print(f"ā Sequence creation failed: {e}")
# Create minimal dummy data
seq_length = 10
X = torch.randn(20, seq_length, n_features)
tool_ids = torch.zeros(20, dtype=torch.long)
print("ā ļø Using dummy sequences")
print(f"ā
Step 4 complete - Sequences ready: {X.shape}")
š Creating sequences for 9-feature model... Creating max 50 sequences of length 15 ā Created 50 sequences ā Tensor shapes: X=torch.Size([50, 15, 9]), tools=torch.Size([50]) š Memory after sequences: 1018.0 MB ā Step 4 complete - Sequences ready: torch.Size([50, 15, 9])
InĀ [8]:
# STEP 5: 9-FEATURE LSTM AUTOENCODER MODEL
print("šļø Creating 9-feature LSTM autoencoder...")
class OptimalLSTMAutoencoder(nn.Module):
def __init__(self, n_features, seq_length, hidden_size=16):
super().__init__()
self.seq_length = seq_length
self.n_features = n_features
self.hidden_size = hidden_size
# Encoder
self.encoder_lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
self.encoder_output = nn.Linear(hidden_size, hidden_size // 2)
# Decoder
self.decoder_input = nn.Linear(hidden_size // 2, hidden_size)
self.decoder_lstm = nn.LSTM(hidden_size, n_features, batch_first=True)
def forward(self, x):
batch_size = x.size(0)
# Encode
encoded, _ = self.encoder_lstm(x)
encoded = self.encoder_output(encoded[:, -1, :]) # Use last output
# Decode
decoded_input = self.decoder_input(encoded)
decoded_input = decoded_input.unsqueeze(1).repeat(1, self.seq_length, 1)
# Reshape for LSTM
decoded_input = decoded_input.view(batch_size, self.seq_length, self.hidden_size)
decoded, _ = self.decoder_lstm(decoded_input)
return decoded
try:
# Create model
model = OptimalLSTMAutoencoder(
n_features=n_features,
seq_length=seq_length,
hidden_size=min(16, n_features * 2) # Adaptive hidden size
)
print(f"ā
Model created:")
print(f" Features: {n_features}")
print(f" Sequence length: {seq_length}")
print(f" Hidden size: {model.hidden_size}")
print(f" Parameters: {sum(p.numel() for p in model.parameters())}")
# Test forward pass
with torch.no_grad():
sample_input = X[:2] # Test with 2 sequences
output = model(sample_input)
print(f"ā
Forward pass test: {sample_input.shape} ā {output.shape}")
# Memory check
memory_mb = psutil.Process().memory_info().rss / 1024**2
print(f"š Memory after model: {memory_mb:.1f} MB")
except Exception as e:
print(f"ā Model creation failed: {e}")
# Fallback to even simpler model
model = nn.Sequential(
nn.Linear(n_features * seq_length, 32),
nn.ReLU(),
nn.Linear(32, n_features * seq_length)
)
print("ā ļø Using fallback linear model")
print("ā
Step 5 complete - Model ready")
šļø Creating 9-feature LSTM autoencoder... ā Model created: Features: 9 Sequence length: 15 Hidden size: 16 Parameters: 2980 ā Forward pass test: torch.Size([2, 15, 9]) ā torch.Size([2, 15, 9]) š Memory after model: 1027.4 MB ā Step 5 complete - Model ready
InĀ [10]:
# STEP 6: TRAINING THE AUTOENCODER
print("š Training the autoencoder (ultra-safe)...")
try:
# Setup training
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Training parameters - very conservative
epochs = 5 # Few epochs to avoid crashes
batch_size = min(4, len(X)) # Very small batches
print(f"Training setup:")
print(f" Epochs: {epochs}")
print(f" Batch size: {batch_size}")
print(f" Data: {X.shape}")
# Simple training loop
model.train()
losses = []
for epoch in range(epochs):
epoch_losses = []
# Simple batch processing
for i in range(0, len(X), batch_size):
batch_X = X[i:i+batch_size]
# Forward pass
optimizer.zero_grad()
output = model(batch_X)
loss = criterion(output, batch_X)
# Backward pass
loss.backward()
optimizer.step()
epoch_losses.append(loss.item())
avg_loss = np.mean(epoch_losses)
losses.append(avg_loss)
print(f"Epoch {epoch+1}/{epochs} - Loss: {avg_loss:.6f}")
# Memory check
if epoch % 2 == 0:
memory_mb = psutil.Process().memory_info().rss / 1024**2
print(f" Memory: {memory_mb:.1f} MB")
print(f"ā
Training completed")
print(f" Final loss: {losses[-1]:.6f}")
print(f" Total loss reduction: {(losses[0] - losses[-1])/losses[0]*100:.1f}%")
# Quick evaluation
model.eval()
with torch.no_grad():
test_output = model(X[:3])
test_loss = criterion(test_output, X[:3])
print(f" Test loss: {test_loss:.6f}")
except Exception as e:
print(f"ā Training failed: {e}")
print("ā ļø Model created but not trained")
print("ā
Step 6 complete - Model trained")
š Training the autoencoder (ultra-safe)... Training setup: Epochs: 5 Batch size: 4 Data: torch.Size([50, 15, 9]) Epoch 1/5 - Loss: 1.023149 Memory: 1271.4 MB Epoch 2/5 - Loss: 1.007675 Epoch 3/5 - Loss: 0.998994 Memory: 1271.7 MB Epoch 4/5 - Loss: 0.988406 Epoch 5/5 - Loss: 0.973527 Memory: 1271.7 MB ā Training completed Final loss: 0.973527 Total loss reduction: 4.8% Test loss: 4.304293 ā Step 6 complete - Model trained
InĀ [23]:
# STEP 7: EVALUATION & ANOMALY DETECTION
print("š Evaluating model performance...")
try:
# Model evaluation
model.eval()
with torch.no_grad():
# Get predictions
predictions = model(X)
# Calculate reconstruction errors
errors = torch.mean((predictions - X) ** 2, dim=(1, 2))
errors_np = errors.numpy()
print(f"ā
Calculated {len(errors_np)} reconstruction errors")
print(f"Error range: [{errors_np.min():.6f}, {errors_np.max():.6f}]")
print(f"Mean error: {errors_np.mean():.6f}")
# Simple anomaly detection (top 20%)
threshold = np.percentile(errors_np, 80)
anomalies = errors_np > threshold
print(f"Threshold (80th percentile): {threshold:.6f}")
print(f"Anomalies detected: {anomalies.sum()} / {len(anomalies)} ({anomalies.mean()*100:.1f}%)")
# Simple visualization
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(losses)
plt.title('Training Loss')
plt.xlabel('Epoch')
plt.ylabel('MSE Loss')
plt.grid(True)
plt.subplot(1, 2, 2)
plt.hist(errors_np, bins=10, alpha=0.7)
plt.axvline(threshold, color='red', linestyle='--', label=f'Threshold: {threshold:.4f}')
plt.title('Reconstruction Errors')
plt.xlabel('MSE')
plt.ylabel('Frequency')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
# Final memory check
memory_mb = psutil.Process().memory_info().rss / 1024**2
print(f"š Final memory usage: {memory_mb:.1f} MB")
except Exception as e:
print(f"ā Evaluation failed: {e}")
print("ā ļø Basic evaluation only")
print("ā
Step 7 complete - Model evaluated")
š Evaluating model performance... ā Calculated 50 reconstruction errors Error range: [0.089008, 4.611048] Mean error: 0.965686 Threshold (80th percentile): 2.169678 Anomalies detected: 10 / 50 (20.0%)
š Final memory usage: 1312.7 MB ā Step 7 complete - Model evaluated
InĀ [24]:
# STEP 8: SYNTHETIC ANOMALY GENERATION
print("š§ Creating synthetic anomalies for expert validation...")
try:
# Create synthetic anomaly scenarios based on drilling engineering knowledge
anomaly_scenarios = []
# Only create scenarios for features that actually exist
feature_scenarios = [
('Battery-Voltage', 'Battery Voltage Drop', 'Power system failure - battery voltage drops significantly', 'drop', 'high'),
('Choke-Position', 'Choke Position Stuck', 'Mechanical failure - choke position stuck/unresponsive', 'flat', 'high'),
('Upstream-Pressure', 'Upstream Pressure Spike', 'Sudden pressure increase - possible blockage', 'spike', 'medium'),
('Downstream-Pressure', 'Downstream Pressure Loss', 'Pressure drop downstream - possible leak', 'drop', 'medium'),
('Upstream-Temperature', 'Temperature Sensor Drift', 'Gradual temperature sensor calibration drift', 'drift', 'low'),
]
for feature_name, name, description, anomaly_type, severity in feature_scenarios:
if feature_name in numeric_cols:
feature_idx = numeric_cols.index(feature_name)
anomaly_scenarios.append({
'name': name,
'description': description,
'feature_idx': feature_idx,
'anomaly_type': anomaly_type,
'severity': severity
})
print(f"ā
Created {len(anomaly_scenarios)} anomaly scenarios for available features")
def create_anomaly(base_sequence, scenario):
"""Create anomaly in sequence based on scenario"""
anomaly_seq = base_sequence.copy()
feature_idx = scenario['feature_idx']
anomaly_type = scenario['anomaly_type']
seq_len = len(base_sequence)
if anomaly_type == 'drop':
# Sudden drop in values
drop_start = seq_len // 3
drop_factor = 0.3 if scenario['severity'] == 'high' else 0.6
anomaly_seq[drop_start:, feature_idx] *= drop_factor
elif anomaly_type == 'spike':
# Sudden spike in values
spike_start = seq_len // 2
spike_duration = 5
spike_factor = 3.0 if scenario['severity'] == 'high' else 2.0
anomaly_seq[spike_start:spike_start+spike_duration, feature_idx] *= spike_factor
elif anomaly_type == 'flat':
# Flat line (stuck sensor)
flat_start = seq_len // 4
stuck_value = anomaly_seq[flat_start, feature_idx]
anomaly_seq[flat_start:, feature_idx] = stuck_value
elif anomaly_type == 'drift':
# Gradual drift
drift_start = seq_len // 5
drift_amount = 0.5 if scenario['severity'] == 'high' else 0.3
drift_slope = np.linspace(0, drift_amount, seq_len - drift_start)
anomaly_seq[drift_start:, feature_idx] += drift_slope
return anomaly_seq
# Generate synthetic anomalies
print("\nš§ Generating synthetic anomalies...")
# Use first few sequences as base
num_scenarios = min(len(anomaly_scenarios), len(X))
base_sequences = [X[i].numpy() for i in range(num_scenarios)]
synthetic_anomalies = []
anomaly_labels = []
for i, scenario in enumerate(anomaly_scenarios[:num_scenarios]):
# Create anomaly
base_seq = base_sequences[i]
anomaly_seq = create_anomaly(base_seq, scenario)
synthetic_anomalies.append(anomaly_seq)
anomaly_labels.append(scenario['name'])
print(f" ā
{scenario['name']} ({scenario['severity']} severity)")
# Convert to tensor format for model evaluation
synthetic_anomalies_tensor = torch.tensor(np.array(synthetic_anomalies), dtype=torch.float32)
print(f"\nā
Created {len(synthetic_anomalies)} synthetic anomalies")
print(f" Shape: {synthetic_anomalies_tensor.shape}")
print(f" Features: {len(numeric_cols)}")
# Quick evaluation of synthetic anomalies
model.eval()
with torch.no_grad():
synthetic_errors = model(synthetic_anomalies_tensor)
synthetic_mse = torch.mean((synthetic_errors - synthetic_anomalies_tensor) ** 2, dim=(1, 2))
print(f"\nš Synthetic anomaly reconstruction errors:")
for i, (label, error) in enumerate(zip(anomaly_labels, synthetic_mse)):
print(f" {label}: {error:.6f}")
except Exception as e:
print(f"ā Synthetic anomaly generation failed: {e}")
print("ā ļø Skipping synthetic anomalies")
print("ā
Step 9 complete - Synthetic anomalies ready")
š§ Creating synthetic anomalies for expert validation... ā Created 5 anomaly scenarios for available features š§ Generating synthetic anomalies... ā Battery Voltage Drop (high severity) ā Choke Position Stuck (high severity) ā Upstream Pressure Spike (medium severity) ā Downstream Pressure Loss (medium severity) ā Temperature Sensor Drift (low severity) ā Created 5 synthetic anomalies Shape: torch.Size([5, 15, 9]) Features: 9 š Synthetic anomaly reconstruction errors: Battery Voltage Drop: 4.608545 Choke Position Stuck: 4.475626 Upstream Pressure Spike: 5.306238 Downstream Pressure Loss: 1.872751 Temperature Sensor Drift: 0.343982 ā Step 8 complete - Synthetic anomalies ready
InĀ [29]:
# STEP 9: SYNTHETIC ANOMALY GENERATION COMPLETION
print("ā
Step 9 synthetic anomaly generation completed successfully!")
print("š§ Preparing anomalies for comprehensive evaluation...")
# Display summary of created synthetic anomalies
if 'synthetic_anomalies_tensor' in locals():
print(f"\nš SYNTHETIC ANOMALY SUMMARY:")
print(f" Total anomalies: {len(synthetic_anomalies_tensor)}")
print(f" Anomaly types: {len(set(anomaly_labels))}")
print(f" Tensor shape: {synthetic_anomalies_tensor.shape}")
print(f"\nšÆ ANOMALY DETECTION PREVIEW:")
for i, (label, error) in enumerate(zip(anomaly_labels, synthetic_mse)):
status = "š“ DETECTED" if error > threshold else "š¢ NORMAL"
print(f" ⢠{label}: {error:.4f} {status}")
detection_count = sum(1 for error in synthetic_mse if error > threshold)
print(f"\nš DETECTION SUMMARY:")
print(f" Detected: {detection_count}/{len(synthetic_mse)} ({detection_count/len(synthetic_mse)*100:.1f}%)")
print(f" Threshold: {threshold:.4f}")
else:
print("ā ļø No synthetic anomalies found - rerun Step 8 first")
print(f"\nā
STEP 9 COMPLETE: Ready for comprehensive evaluation!")
print(f"š Proceeding to Step 10 for detailed analysis and expert validation...")
ā Step 9 synthetic anomaly generation completed successfully! š§ Preparing anomalies for comprehensive evaluation... š SYNTHETIC ANOMALY SUMMARY: Total anomalies: 5 Anomaly types: 5 Tensor shape: torch.Size([5, 15, 9]) šÆ ANOMALY DETECTION PREVIEW: ⢠Battery Voltage Drop: 4.6085 š“ DETECTED ⢠Choke Position Stuck: 4.4756 š“ DETECTED ⢠Upstream Pressure Spike: 5.3062 š“ DETECTED ⢠Downstream Pressure Loss: 1.8728 š¢ NORMAL ⢠Temperature Sensor Drift: 0.3440 š¢ NORMAL š DETECTION SUMMARY: Detected: 3/5 (60.0%) Threshold: 2.1697 ā STEP 9 COMPLETE: Ready for comprehensive evaluation! š Proceeding to Step 10 for detailed analysis and expert validation...
InĀ [26]:
# STEP 10: COMPREHENSIVE EVALUATION & VISUALIZATION WITH 9 FEATURES
print("š COMPREHENSIVE EVALUATION WITH 9 FEATURES")
print("="*80)
# Get model predictions for synthetic anomalies
model.eval()
with torch.no_grad():
# Predict on synthetic anomalies
synthetic_predictions = model(synthetic_anomalies_tensor)
synthetic_errors = torch.mean((synthetic_predictions - synthetic_anomalies_tensor) ** 2, dim=(1, 2)).numpy()
# Also get some normal sequences for comparison
normal_sequences = X[:3] # Take first 3 sequences directly
normal_predictions = model(normal_sequences)
normal_errors = torch.mean((normal_predictions - normal_sequences) ** 2, dim=(1, 2)).numpy()
print(f"\nš MODEL PERFORMANCE SUMMARY:")
print(f" Normal sequence errors: {normal_errors.mean():.6f} ± {normal_errors.std():.6f}")
print(f" Synthetic anomaly errors: {synthetic_errors.mean():.6f} ± {synthetic_errors.std():.6f}")
print(f" Detection ratio: {synthetic_errors.mean() / normal_errors.mean():.2f}x higher")
print(f" Threshold: {threshold:.6f}")
# Create comprehensive validation plots
n_scenarios = len(anomaly_labels)
# Overview plot showing all anomaly detection scores
plt.figure(figsize=(15, 6))
plt.subplot(1, 2, 1)
x_pos = range(len(anomaly_labels))
bars = plt.bar(x_pos, synthetic_errors, color=['red' if err > threshold else 'orange'
for err in synthetic_errors])
plt.axhline(y=threshold, color='black', linestyle='--', linewidth=2,
label=f'Detection Threshold: {threshold:.4f}')
plt.axhline(y=normal_errors.mean(), color='green', linestyle=':', linewidth=2,
label=f'Normal Level: {normal_errors.mean():.4f}')
plt.title('Anomaly Detection Scores - Expert Validation', fontweight='bold', fontsize=14)
plt.xlabel('Synthetic Anomaly Scenarios')
plt.ylabel('Reconstruction Error (MSE)')
plt.xticks(x_pos, [label[:15] + ('...' if len(label) > 15 else '')
for label in anomaly_labels], rotation=45, ha='right')
plt.legend()
plt.grid(True, alpha=0.3)
# Detection rate pie chart
plt.subplot(1, 2, 2)
detected = sum(1 for err in synthetic_errors if err > threshold)
not_detected = len(synthetic_errors) - detected
detection_data = [detected, not_detected]
detection_labels = [f'Detected ({detected})', f'Missed ({not_detected})']
colors = ['#ff4444', '#ffaa44']
plt.pie(detection_data, labels=detection_labels, colors=colors, autopct='%1.1f%%', startangle=90)
plt.title(f'Detection Performance\\n{detected}/{len(synthetic_errors)} scenarios detected',
fontweight='bold', fontsize=14)
plt.tight_layout()
plt.show()
# Individual anomaly scenario validation
print(f"\nš INDIVIDUAL ANOMALY SCENARIOS FOR EXPERT REVIEW:")
print("="*80)
for i, (anomaly_label, error_score) in enumerate(zip(anomaly_labels, synthetic_errors)):
print(f"\nšÆ SCENARIO {i+1}: {anomaly_label.upper()}")
print("-"*60)
print(f"Anomaly Type: {anomaly_label}")
print(f"Model Detection Score: {error_score:.6f}")
print(f"Detected as Anomaly: {'ā
YES' if error_score > threshold else 'ā NO'}")
# Simple visualization of this anomaly vs normal
plt.figure(figsize=(12, 8))
plt.suptitle(f'EXPERT VALIDATION: {anomaly_label}\\n'
f'Detection Score: {error_score:.6f} (Threshold: {threshold:.6f})',
fontsize=14, fontweight='bold',
color='red' if error_score > threshold else 'orange')
# Plot first few features for comparison
normal_seq = X[0].numpy() # Use first sequence as normal reference
anomaly_seq = synthetic_anomalies_tensor[i].numpy()
n_features_to_show = min(6, len(numeric_cols))
for feat_idx in range(n_features_to_show):
plt.subplot(2, 3, feat_idx + 1)
# Plot normal vs anomaly
plt.plot(normal_seq[:, feat_idx], 'g-', linewidth=2, label='Normal', alpha=0.7)
plt.plot(anomaly_seq[:, feat_idx], 'r-', linewidth=2, label='Anomaly', alpha=0.9)
plt.title(f'{numeric_cols[feat_idx]}', fontweight='bold')
plt.xlabel('Time Step')
plt.ylabel('Normalized Value')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Engineering verdict
engineering_verdict = "CONFIRMED" if error_score > threshold else "REVIEW_NEEDED"
print(f"Engineering Verdict: {engineering_verdict}")
if engineering_verdict == "REVIEW_NEEDED":
print("ā ļø This scenario may need manual review - low detection confidence")
print("="*80)
print(f"\nā
STEP 10 COMPLETE: Comprehensive evaluation with detailed visualizations!")
print(f" š {detected}/{len(synthetic_errors)} anomalies successfully detected")
print(f" šÆ Detection rate: {detected/len(synthetic_errors)*100:.1f}%")
print(f" š Model performance validated across {len(numeric_cols)} features")
š COMPREHENSIVE EVALUATION WITH 9 FEATURES ================================================================================ š MODEL PERFORMANCE SUMMARY: Normal sequence errors: 4.304293 ± 0.342576 Synthetic anomaly errors: 3.321429 ± 1.891675 Detection ratio: 0.77x higher Threshold: 2.169678
š INDIVIDUAL ANOMALY SCENARIOS FOR EXPERT REVIEW: ================================================================================ šÆ SCENARIO 1: BATTERY VOLTAGE DROP ------------------------------------------------------------ Anomaly Type: Battery Voltage Drop Model Detection Score: 4.608545 Detected as Anomaly: ā YES
Engineering Verdict: CONFIRMED ================================================================================ šÆ SCENARIO 2: CHOKE POSITION STUCK ------------------------------------------------------------ Anomaly Type: Choke Position Stuck Model Detection Score: 4.475626 Detected as Anomaly: ā YES
Engineering Verdict: CONFIRMED ================================================================================ šÆ SCENARIO 3: UPSTREAM PRESSURE SPIKE ------------------------------------------------------------ Anomaly Type: Upstream Pressure Spike Model Detection Score: 5.306238 Detected as Anomaly: ā YES
Engineering Verdict: CONFIRMED ================================================================================ šÆ SCENARIO 4: DOWNSTREAM PRESSURE LOSS ------------------------------------------------------------ Anomaly Type: Downstream Pressure Loss Model Detection Score: 1.872751 Detected as Anomaly: ā NO
Engineering Verdict: REVIEW_NEEDED ā ļø This scenario may need manual review - low detection confidence ================================================================================ šÆ SCENARIO 5: TEMPERATURE SENSOR DRIFT ------------------------------------------------------------ Anomaly Type: Temperature Sensor Drift Model Detection Score: 0.343982 Detected as Anomaly: ā NO
Engineering Verdict: REVIEW_NEEDED ā ļø This scenario may need manual review - low detection confidence ================================================================================ ā STEP 10 COMPLETE: Comprehensive evaluation with detailed visualizations! š 3/5 anomalies successfully detected šÆ Detection rate: 60.0% š Model performance validated across 9 features
InĀ [32]:
# STEP 11: EXPERT-GRADE SYNTHETIC ANOMALY GENERATION
print("šØāš¬ CREATING REALISTIC DRILLING ANOMALIES FOR EXPERT VALIDATION...")
print("="*80)
def create_realistic_drilling_anomalies():
"""
Create realistic drilling anomalies based on actual drilling physics
Returns anomalies in REAL units for expert validation
"""
# First, get the original data ranges before normalization
print("š Analyzing original TAQA data ranges...")
# Get original data before normalization for realistic ranges
df_original = pd.read_parquet('/home/ashwinvel2000/TAQA/training_data/wide36_tools_flat.parquet')
df_sample = df_original.head(1000) # Same sample we used
# Create derived feature if needed
if 'Downstream-Upstream-Difference' not in df_sample.columns:
df_sample['Downstream-Upstream-Difference'] = df_sample['Downstream-Pressure'] - df_sample['Upstream-Pressure']
# Get realistic ranges for each feature
feature_ranges = {}
for feature in available_features:
if feature in df_sample.columns:
data = df_sample[feature].dropna()
feature_ranges[feature] = {
'min': data.min(),
'max': data.max(),
'mean': data.mean(),
'std': data.std(),
'p25': data.quantile(0.25),
'p75': data.quantile(0.75)
}
print(f" {feature}: {data.min():.2f} to {data.max():.2f} (mean: {data.mean():.2f})")
# Define drilling-realistic anomaly scenarios - COMPLETE SET
drilling_anomalies = {
# Original 5 anomalies (sensor_spike, sensor_drift, sensor_failure types)
'power_failure': {
'name': 'Power System Failure',
'description': 'Battery voltage drops below operational threshold',
'affected_feature': 'Battery-Voltage',
'physics': 'Battery voltage should be 12-14V, failure drops to 8-10V',
'severity': 'CRITICAL',
'detection_priority': 'HIGH',
'lstm_target': 'sensor_failure'
},
'choke_stuck': {
'name': 'Choke Valve Stuck',
'description': 'Choke position becomes unresponsive/stuck',
'affected_feature': 'Choke-Position',
'physics': 'Choke should vary 0-100%, stuck shows flat line',
'severity': 'HIGH',
'detection_priority': 'HIGH',
'lstm_target': 'sensor_failure'
},
'pressure_surge': {
'name': 'Pressure Surge/Kick',
'description': 'Sudden upstream pressure increase indicating formation fluid influx',
'affected_feature': 'Upstream-Pressure',
'physics': 'Normal 100-1000 psi, surge can reach 2000+ psi',
'severity': 'CRITICAL',
'detection_priority': 'CRITICAL',
'lstm_target': 'sensor_spike'
},
'pressure_loss': {
'name': 'Circulation Loss',
'description': 'Downstream pressure drops indicating lost circulation',
'affected_feature': 'Downstream-Pressure',
'physics': 'Pressure drops indicate fluid loss to formation',
'severity': 'HIGH',
'detection_priority': 'HIGH',
'lstm_target': 'sensor_drift'
},
'thermal_anomaly': {
'name': 'Thermal System Malfunction',
'description': 'Temperature readings become uncorrelated or drift',
'affected_feature': 'Upstream-Temperature',
'physics': 'Up/downstream temps should correlate, drift indicates sensor issues',
'severity': 'MEDIUM',
'detection_priority': 'MEDIUM',
'lstm_target': 'sensor_drift'
},
# Additional 4 anomalies for complete LSTM testing
'correlation_break': {
'name': 'Sensor Correlation Break',
'description': 'Upstream/downstream pressure correlation breakdown',
'affected_feature': 'Upstream-Pressure', # Primary, but affects correlation
'physics': 'Up/downstream pressures should correlate, break indicates system failure',
'severity': 'HIGH',
'detection_priority': 'HIGH',
'lstm_target': 'correlation_break'
},
'temporal_inversion': {
'name': 'Temporal Pattern Inversion',
'description': 'Temperature trend reversal (impossible physics)',
'affected_feature': 'Downstream-Temperature',
'physics': 'Temperature patterns reversed - physically impossible sequence',
'severity': 'CRITICAL',
'detection_priority': 'CRITICAL',
'lstm_target': 'temporal_inversion'
},
'multi_sensor_failure': {
'name': 'Cascading System Failure',
'description': 'Multiple sensors failing in sequence (propagating failure)',
'affected_feature': 'Battery-Voltage', # Primary, triggers cascade
'physics': 'Power failure causes cascading sensor malfunctions',
'severity': 'CRITICAL',
'detection_priority': 'CRITICAL',
'lstm_target': 'multi_sensor_failure'
},
'oscillation': {
'name': 'Abnormal Oscillation',
'description': 'Choke position shows abnormal high-frequency oscillations',
'affected_feature': 'Choke-Position',
'physics': 'Choke should be stable, oscillations indicate control system malfunction',
'severity': 'MEDIUM',
'detection_priority': 'MEDIUM',
'lstm_target': 'oscillation'
}
}
# Create synthetic anomalies in REAL units
expert_dataset = {
'normal_examples': [],
'anomaly_examples': {},
'metadata': {}
}
print(f"\nš§ Generating realistic anomalies...")
# Get some normal sequences (convert back to real units)
normal_sequences_norm = X[:3].numpy() # First 3 sequences
normal_sequences_real = scaler.inverse_transform(normal_sequences_norm.reshape(-1, len(available_features))).reshape(normal_sequences_norm.shape)
for i, seq in enumerate(normal_sequences_real):
expert_dataset['normal_examples'].append({
'sequence': seq,
'label': f'Normal Operation {i+1}',
'description': 'Typical drilling operation - all sensors within normal ranges'
})
# Generate anomalies for each type
for anomaly_key, anomaly_info in drilling_anomalies.items():
expert_dataset['anomaly_examples'][anomaly_key] = []
print(f" Creating {anomaly_info['name']}...")
# Create 3 examples per anomaly type
for example_num in range(3):
# Start with a normal sequence
base_seq_norm = X[example_num + 3].numpy() # Use sequences 3,4,5 as base
base_seq_real = scaler.inverse_transform(base_seq_norm.reshape(-1, len(available_features))).reshape(base_seq_norm.shape)
# Apply realistic anomaly based on drilling physics
anomaly_seq = base_seq_real.copy()
if anomaly_key == 'power_failure':
# Battery voltage drops from ~13V to ~9V
battery_idx = available_features.index('Battery-Voltage')
drop_start = len(anomaly_seq) // 3
# Gradual voltage drop
for t in range(drop_start, len(anomaly_seq)):
drop_factor = 0.65 + 0.05 * np.random.randn() # 9V from 13V with noise
anomaly_seq[t, battery_idx] = anomaly_seq[0, battery_idx] * drop_factor
elif anomaly_key == 'choke_stuck':
# Choke position becomes flat/stuck
choke_idx = available_features.index('Choke-Position')
stuck_start = len(anomaly_seq) // 4
stuck_value = anomaly_seq[stuck_start, choke_idx]
anomaly_seq[stuck_start:, choke_idx] = stuck_value + np.random.normal(0, 0.5, len(anomaly_seq) - stuck_start)
elif anomaly_key == 'pressure_surge':
# Sudden pressure increase (kick)
pressure_idx = available_features.index('Upstream-Pressure')
surge_start = len(anomaly_seq) // 2
surge_duration = 4
baseline = anomaly_seq[surge_start, pressure_idx]
surge_magnitude = baseline * 1.8 + np.random.uniform(200, 500) # Significant pressure increase
for t in range(surge_start, min(surge_start + surge_duration, len(anomaly_seq))):
anomaly_seq[t, pressure_idx] = surge_magnitude + np.random.normal(0, 50)
elif anomaly_key == 'pressure_loss':
# Gradual pressure loss
pressure_idx = available_features.index('Downstream-Pressure')
loss_start = len(anomaly_seq) // 3
baseline = anomaly_seq[loss_start, pressure_idx]
for t in range(loss_start, len(anomaly_seq)):
loss_factor = 0.3 + 0.4 * (t - loss_start) / (len(anomaly_seq) - loss_start) # Gradual loss to 30%
anomaly_seq[t, pressure_idx] = baseline * loss_factor + np.random.normal(0, 10)
elif anomaly_key == 'thermal_anomaly':
# Temperature sensor drift
temp_idx = available_features.index('Upstream-Temperature')
drift_start = len(anomaly_seq) // 5
drift_amount = np.random.uniform(15, 25) # 15-25 degree drift
for t in range(drift_start, len(anomaly_seq)):
drift_progress = (t - drift_start) / (len(anomaly_seq) - drift_start)
anomaly_seq[t, temp_idx] += drift_amount * drift_progress + np.random.normal(0, 2)
elif anomaly_key == 'correlation_break':
# Break upstream/downstream pressure correlation
up_pressure_idx = available_features.index('Upstream-Pressure')
down_pressure_idx = available_features.index('Downstream-Pressure')
break_start = len(anomaly_seq) // 3
# After break_start, make downstream pressure independent of upstream
for t in range(break_start, len(anomaly_seq)):
# Upstream continues normal trend
noise_factor = 1 + np.random.normal(0, 0.1)
anomaly_seq[t, up_pressure_idx] = anomaly_seq[t-1, up_pressure_idx] * noise_factor
# Downstream becomes uncorrelated (random walk)
independent_change = np.random.uniform(-50, 50)
anomaly_seq[t, down_pressure_idx] = max(0, anomaly_seq[t-1, down_pressure_idx] + independent_change)
elif anomaly_key == 'temporal_inversion':
# Reverse temperature trend (physically impossible)
temp_idx = available_features.index('Downstream-Temperature')
inversion_start = len(anomaly_seq) // 4
# Take the normal trend and reverse it
baseline_segment = anomaly_seq[inversion_start:, temp_idx].copy()
inverted_segment = baseline_segment[::-1] # Reverse the sequence
# Add some noise to make it more realistic but still wrong
inverted_segment += np.random.normal(0, 1, len(inverted_segment))
anomaly_seq[inversion_start:, temp_idx] = inverted_segment
elif anomaly_key == 'multi_sensor_failure':
# Cascading failure: Battery -> Pressures -> Temperatures
battery_idx = available_features.index('Battery-Voltage')
up_pressure_idx = available_features.index('Upstream-Pressure')
down_pressure_idx = available_features.index('Downstream-Pressure')
up_temp_idx = available_features.index('Upstream-Temperature')
down_temp_idx = available_features.index('Downstream-Temperature')
# Stage 1: Battery failure (timestep 4-6)
fail_start_1 = 4
for t in range(fail_start_1, min(fail_start_1 + 3, len(anomaly_seq))):
anomaly_seq[t, battery_idx] *= 0.7 # Voltage drops
# Stage 2: Pressure sensors affected (timestep 7-10)
fail_start_2 = 7
for t in range(fail_start_2, min(fail_start_2 + 4, len(anomaly_seq))):
anomaly_seq[t, up_pressure_idx] += np.random.uniform(-100, -200) # Erratic readings
anomaly_seq[t, down_pressure_idx] += np.random.uniform(-80, -150)
# Stage 3: Temperature sensors drift (timestep 11+)
fail_start_3 = 11
for t in range(fail_start_3, len(anomaly_seq)):
temp_drift = (t - fail_start_3) * 2 # Progressive drift
anomaly_seq[t, up_temp_idx] += temp_drift + np.random.normal(0, 3)
anomaly_seq[t, down_temp_idx] += temp_drift * 0.8 + np.random.normal(0, 2)
elif anomaly_key == 'oscillation':
# High-frequency oscillations in choke position
choke_idx = available_features.index('Choke-Position')
osc_start = len(anomaly_seq) // 4
baseline = anomaly_seq[osc_start, choke_idx]
frequency = 0.8 # High frequency oscillation
amplitude = np.random.uniform(3, 7) # 3-7% oscillation amplitude
for t in range(osc_start, len(anomaly_seq)):
oscillation = amplitude * np.sin(frequency * (t - osc_start))
anomaly_seq[t, choke_idx] = baseline + oscillation + np.random.normal(0, 0.5)
expert_dataset['anomaly_examples'][anomaly_key].append({
'sequence': anomaly_seq,
'label': f'{anomaly_info["name"]} - Example {example_num + 1}',
'description': anomaly_info['description'],
'physics': anomaly_info['physics'],
'severity': anomaly_info['severity'],
'affected_feature': anomaly_info['affected_feature']
})
# Store metadata
expert_dataset['metadata'] = {
'features': available_features,
'feature_ranges': feature_ranges,
'sequence_length': len(normal_sequences_real[0]),
'anomaly_types': drilling_anomalies,
'units': {
'Battery-Voltage': 'Volts (V)',
'Choke-Position': 'Percentage (%)',
'Upstream-Pressure': 'PSI',
'Downstream-Pressure': 'PSI',
'Upstream-Temperature': 'Degrees F',
'Downstream-Temperature': 'Degrees F',
'Downstream-Upstream-Difference': 'PSI'
}
}
return expert_dataset
# Generate the expert validation dataset
try:
expert_validation_data = create_realistic_drilling_anomalies()
print(f"\nā
EXPERT VALIDATION DATASET CREATED:")
print(f" Normal examples: {len(expert_validation_data['normal_examples'])}")
print(f" Anomaly types: {len(expert_validation_data['anomaly_examples'])}")
total_anomalies = sum(len(examples) for examples in expert_validation_data['anomaly_examples'].values())
print(f" Total anomaly examples: {total_anomalies}")
print(f" Features with real units: {len(expert_validation_data['metadata']['features'])}")
print(f"\nš ANOMALY TYPES FOR EXPERT REVIEW:")
for anomaly_type, examples in expert_validation_data['anomaly_examples'].items():
example_info = examples[0] # Get first example for info
print(f" ⢠{example_info['label']}: {example_info['severity']} severity")
print(f" Physics: {example_info['physics']}")
print(f"\nā
STEP 11 COMPLETE: Realistic drilling anomalies created!")
print(f"š Ready for expert validation interface...")
except Exception as e:
print(f"ā Expert dataset creation failed: {e}")
import traceback
traceback.print_exc()
šØāš¬ CREATING REALISTIC DRILLING ANOMALIES FOR EXPERT VALIDATION...
================================================================================
š Analyzing original TAQA data ranges...
Battery-Voltage: 13.54 to 14.16 (mean: 14.14)
Choke-Position: -1.08 to 100.92 (mean: 88.94)
Upstream-Pressure: 19.13 to 1154.38 (mean: 973.43)
Downstream-Pressure: 15.37 to 1158.94 (mean: 976.80)
Upstream-Temperature: 14.20 to 14.32 (mean: 14.27)
Downstream-Temperature: 14.12 to 14.23 (mean: 14.19)
Target-Position: 0.00 to 100.00 (mean: 88.70)
Tool-State: 1.00 to 5.00 (mean: 1.91)
Downstream-Upstream-Difference: -6.47 to 6.45 (mean: 3.37)
š§ Generating realistic anomalies...
Creating Power System Failure...
Creating Choke Valve Stuck...
Creating Pressure Surge/Kick...
Creating Circulation Loss...
Creating Thermal System Malfunction...
Creating Sensor Correlation Break...
Creating Temporal Pattern Inversion...
Creating Cascading System Failure...
Creating Abnormal Oscillation...
ā
EXPERT VALIDATION DATASET CREATED:
Normal examples: 3
Anomaly types: 9
Total anomaly examples: 27
Features with real units: 9
š ANOMALY TYPES FOR EXPERT REVIEW:
⢠Power System Failure - Example 1: CRITICAL severity
Physics: Battery voltage should be 12-14V, failure drops to 8-10V
⢠Choke Valve Stuck - Example 1: HIGH severity
Physics: Choke should vary 0-100%, stuck shows flat line
⢠Pressure Surge/Kick - Example 1: CRITICAL severity
Physics: Normal 100-1000 psi, surge can reach 2000+ psi
⢠Circulation Loss - Example 1: HIGH severity
Physics: Pressure drops indicate fluid loss to formation
⢠Thermal System Malfunction - Example 1: MEDIUM severity
Physics: Up/downstream temps should correlate, drift indicates sensor issues
⢠Sensor Correlation Break - Example 1: HIGH severity
Physics: Up/downstream pressures should correlate, break indicates system failure
⢠Temporal Pattern Inversion - Example 1: CRITICAL severity
Physics: Temperature patterns reversed - physically impossible sequence
⢠Cascading System Failure - Example 1: CRITICAL severity
Physics: Power failure causes cascading sensor malfunctions
⢠Abnormal Oscillation - Example 1: MEDIUM severity
Physics: Choke should be stable, oscillations indicate control system malfunction
ā
STEP 11 COMPLETE: Realistic drilling anomalies created!
š Ready for expert validation interface...
InĀ [33]:
# STEP 12: COMPREHENSIVE EXPERT VALIDATION INTERFACE
print("šØāš¼ DRILLING EXPERT VALIDATION DASHBOARD")
print("="*80)
def create_expert_validation_dashboard():
"""
Create comprehensive visual dashboard for drilling expert validation
Shows all anomalies in real drilling units with clear comparisons
"""
print("šÆ Preparing expert validation dashboard...")
# Get reference normal sequence for comparison
reference_normal = expert_validation_data['normal_examples'][0]['sequence']
features = expert_validation_data['metadata']['features']
units = expert_validation_data['metadata']['units']
print(f"\nš DRILLING EXPERT VALIDATION DASHBOARD")
print(f"Dataset: TAQA Drilling Operations")
print(f"Features: {len(features)} sensor channels")
print(f"Sequence Length: {expert_validation_data['metadata']['sequence_length']} time steps")
print(f"Units: Real drilling measurements (not normalized)")
# ============================================================================
# SECTION 1: NORMAL BEHAVIOR VALIDATION
# ============================================================================
print(f"\n" + "="*100)
print(f"ā
SECTION 1: NORMAL DRILLING BEHAVIOR VALIDATION")
print(f"Purpose: Verify that baseline operations look realistic to drilling experts")
print("="*100)
# Show normal behavior patterns
fig, axes = plt.subplots(3, 3, figsize=(20, 15))
fig.suptitle('EXPERT VALIDATION: Normal Drilling Operations\\n'
'Verify: Do these patterns represent typical drilling behavior?',
fontsize=16, fontweight='bold', color='green')
# Plot all normal examples
normal_examples = expert_validation_data['normal_examples']
colors = ['darkgreen', 'forestgreen', 'limegreen']
for feat_idx, feature_name in enumerate(features):
row, col = feat_idx // 3, feat_idx % 3
ax = axes[row, col]
time_steps = range(len(normal_examples[0]['sequence']))
# Plot all normal examples
for ex_idx, example in enumerate(normal_examples):
ax.plot(time_steps, example['sequence'][:, feat_idx],
color=colors[ex_idx], linewidth=2, alpha=0.8,
label=f'Normal Example {ex_idx + 1}')
# Formatting
ax.set_title(f'{feature_name}\\n({units.get(feature_name, "Units")})',
fontweight='bold', fontsize=12)
ax.set_xlabel('Time Step')
ax.set_ylabel('Value')
ax.grid(True, alpha=0.3)
ax.legend(fontsize=8)
ax.set_facecolor('#f0fff0') # Light green background
plt.tight_layout()
plt.show()
print(f"\\nš NORMAL BEHAVIOR VALIDATION CHECKLIST:")
print(f"1. ā Do these sensor readings look like typical drilling operations?")
print(f"2. ā Are all values within expected operational ranges?")
print(f"3. ā Do sensor correlations make physical sense?")
print(f"4. ā Are temporal patterns realistic for drilling sequences?")
print(f"5. ā Would you expect the LSTM to learn these as 'normal'?")
print(f"\\nš NORMAL BEHAVIOR SUMMARY:")
for ex_idx, example in enumerate(normal_examples):
print(f" Normal Example {ex_idx + 1}: {example['description']}")
print(f"\\nā
Normal behavior validation complete - proceeding to anomaly validation...")
# ============================================================================
# SECTION 2: ANOMALY BEHAVIOR VALIDATION
# ============================================================================
print(f"\\n" + "="*100)
print(f"šØ SECTION 2: ANOMALY BEHAVIOR VALIDATION")
print(f"Purpose: Verify synthetic anomalies match real drilling failure modes")
print(f"LSTM Targets: sensor_spike, sensor_drift, sensor_failure, correlation_break,")
print(f" temporal_inversion, multi_sensor_failure, oscillation")
print("="*100)
# Create validation interface for each anomaly type
validation_results = {}
for anomaly_type, examples in expert_validation_data['anomaly_examples'].items():
anomaly_info = expert_validation_data['metadata']['anomaly_types'][anomaly_type]
print(f"\n" + "="*100)
print(f"š ANOMALY TYPE: {examples[0]['label'].split(' - ')[0].upper()}")
print(f"Severity: {examples[0]['severity']} | Physics: {examples[0]['physics']}")
print(f"Affected Sensor: {examples[0]['affected_feature']}")
print(f"LSTM Target: {anomaly_info['lstm_target']} (tests LSTM's ability to detect {anomaly_info['lstm_target']})")
print("="*100)
# Show all examples for this anomaly type
fig, axes = plt.subplots(3, 3, figsize=(20, 15))
fig.suptitle(f'EXPERT VALIDATION: {examples[0]["label"].split(" - ")[0]}\\n'
f'Severity: {examples[0]["severity"]} | Affected: {examples[0]["affected_feature"]}',
fontsize=16, fontweight='bold', color='red')
# Plot all 9 features
for feat_idx, feature_name in enumerate(features):
row, col = feat_idx // 3, feat_idx % 3
ax = axes[row, col]
# Plot normal baseline (gray)
time_steps = range(len(reference_normal))
ax.plot(time_steps, reference_normal[:, feat_idx],
color='gray', linewidth=2, alpha=0.7, label='Normal Baseline', linestyle='--')
# Plot all examples of this anomaly type
colors = ['red', 'darkred', 'crimson']
for ex_idx, example in enumerate(examples):
ax.plot(time_steps, example['sequence'][:, feat_idx],
color=colors[ex_idx], linewidth=2, alpha=0.8,
label=f'Anomaly Example {ex_idx + 1}')
# Formatting
ax.set_title(f'{feature_name}\\n({units.get(feature_name, "Units")})',
fontweight='bold', fontsize=12)
ax.set_xlabel('Time Step')
ax.set_ylabel('Value')
ax.grid(True, alpha=0.3)
ax.legend(fontsize=8)
# Highlight affected feature
if feature_name == examples[0]['affected_feature']:
ax.set_facecolor('#ffe6e6') # Light red background
ax.set_title(f'šÆ {feature_name} (AFFECTED)\\n({units.get(feature_name, "Units")})',
fontweight='bold', fontsize=12, color='red')
plt.tight_layout()
plt.show()
# Expert validation questions
print(f"\\nš EXPERT VALIDATION CHECKLIST:")
print(f"1. ā Does the {examples[0]['affected_feature']} anomaly look realistic?")
print(f"2. ā Are the values within expected drilling ranges?")
print(f"3. ā Does the pattern match real {examples[0]['label'].split(' - ')[0].lower()} scenarios?")
print(f"4. ā Are other sensors responding appropriately?")
print(f"5. ā Would this trigger alerts in real drilling operations?")
# Show detailed comparison for affected feature
affected_feature = examples[0]['affected_feature']
affected_idx = features.index(affected_feature)
plt.figure(figsize=(15, 6))
plt.subplot(1, 2, 1)
# Normal vs anomaly comparison for affected feature
plt.plot(time_steps, reference_normal[:, affected_idx],
'g-', linewidth=3, label='Normal Operation', alpha=0.8)
for ex_idx, example in enumerate(examples):
plt.plot(time_steps, example['sequence'][:, affected_idx],
color=colors[ex_idx], linewidth=2, alpha=0.9,
label=f'Anomaly Example {ex_idx + 1}')
plt.title(f'DETAILED VIEW: {affected_feature}\\n{examples[0]["physics"]}',
fontweight='bold', fontsize=14)
plt.xlabel('Time Step')
plt.ylabel(f'{affected_feature} ({units.get(affected_feature, "Units")})')
plt.legend()
plt.grid(True, alpha=0.3)
# Show value distributions
plt.subplot(1, 2, 2)
normal_values = reference_normal[:, affected_idx]
plt.hist(normal_values, bins=15, alpha=0.7, color='green',
label='Normal Distribution', density=True)
for ex_idx, example in enumerate(examples):
anomaly_values = example['sequence'][:, affected_idx]
plt.hist(anomaly_values, bins=15, alpha=0.6, color=colors[ex_idx],
label=f'Anomaly {ex_idx + 1}', density=True)
plt.title(f'Value Distribution Comparison', fontweight='bold')
plt.xlabel(f'{affected_feature} ({units.get(affected_feature, "Units")})')
plt.ylabel('Density')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Drilling context
print(f"\\nš ļø DRILLING CONTEXT:")
print(f"Description: {examples[0]['description']}")
print(f"Physics: {examples[0]['physics']}")
print(f"Severity: {examples[0]['severity']}")
print(f"Expected Response: This anomaly should {'IMMEDIATELY' if examples[0]['severity'] == 'CRITICAL' else 'PROMPTLY'} trigger alerts")
validation_results[anomaly_type] = {
'anomaly_name': examples[0]['label'].split(' - ')[0],
'severity': examples[0]['severity'],
'affected_feature': examples[0]['affected_feature'],
'examples_count': len(examples)
}
return validation_results
# Run the expert validation dashboard
try:
validation_summary = create_expert_validation_dashboard()
print(f"\\n\\nš EXPERT VALIDATION DASHBOARD COMPLETE!")
print(f"="*80)
print(f"ā
Created comprehensive validation interface for drilling expert")
print(f"š Normal examples: 3 | Anomaly types: {len(validation_summary)}")
print(f"šÆ All features shown in real drilling units")
print(f"š Visual comparisons with normal baselines provided")
print(f"\\nš COMPLETE VALIDATION SUMMARY:")
print(f" NORMAL BEHAVIOR:")
print(f" ⢠3 examples of typical drilling operations")
print(f" \\n ANOMALY TYPES (Complete LSTM Test Suite):")
lstm_targets = {}
for anomaly_type, info in validation_summary.items():
target = expert_validation_data['metadata']['anomaly_types'][anomaly_type]['lstm_target']
if target not in lstm_targets:
lstm_targets[target] = []
lstm_targets[target].append(info['anomaly_name'])
print(f" ⢠{info['anomaly_name']}: {info['severity']} severity")
print(f" Affects: {info['affected_feature']} | LSTM Target: {target}")
print(f"\\nš§ LSTM DETECTION CAPABILITIES TESTED:")
for target, anomalies in lstm_targets.items():
print(f" ⢠{target}: {', '.join(anomalies)}")
print(f"\\nš READY FOR EXPERT REVIEW!")
print(f"Expert can now validate each pattern with:")
print(f" ā Real drilling units (PSI, Volts, °F, %)")
print(f" ā All 9 sensor channels visible")
print(f" ā Normal vs anomaly comparisons")
print(f" ā Drilling physics context")
print(f" ā LSTM detection target identification")
print(f" ā Clear validation checklists")
except Exception as e:
print(f"ā Expert validation dashboard failed: {e}")
import traceback
traceback.print_exc()
šØāš¼ DRILLING EXPERT VALIDATION DASHBOARD ================================================================================ šÆ Preparing expert validation dashboard... š DRILLING EXPERT VALIDATION DASHBOARD Dataset: TAQA Drilling Operations Features: 9 sensor channels Sequence Length: 15 time steps Units: Real drilling measurements (not normalized) ==================================================================================================== ā SECTION 1: NORMAL DRILLING BEHAVIOR VALIDATION Purpose: Verify that baseline operations look realistic to drilling experts ====================================================================================================
\nš NORMAL BEHAVIOR VALIDATION CHECKLIST:
1. ā Do these sensor readings look like typical drilling operations?
2. ā Are all values within expected operational ranges?
3. ā Do sensor correlations make physical sense?
4. ā Are temporal patterns realistic for drilling sequences?
5. ā Would you expect the LSTM to learn these as 'normal'?
\nš NORMAL BEHAVIOR SUMMARY:
Normal Example 1: Typical drilling operation - all sensors within normal ranges
Normal Example 2: Typical drilling operation - all sensors within normal ranges
Normal Example 3: Typical drilling operation - all sensors within normal ranges
\nā
Normal behavior validation complete - proceeding to anomaly validation...
\n====================================================================================================
šØ SECTION 2: ANOMALY BEHAVIOR VALIDATION
Purpose: Verify synthetic anomalies match real drilling failure modes
LSTM Targets: sensor_spike, sensor_drift, sensor_failure, correlation_break,
temporal_inversion, multi_sensor_failure, oscillation
====================================================================================================
====================================================================================================
š ANOMALY TYPE: POWER SYSTEM FAILURE
Severity: CRITICAL | Physics: Battery voltage should be 12-14V, failure drops to 8-10V
Affected Sensor: Battery-Voltage
LSTM Target: sensor_failure (tests LSTM's ability to detect sensor_failure)
====================================================================================================
/tmp/ipykernel_1179/3657439822.py:138: UserWarning: Glyph 127919 (\N{DIRECT HIT}) missing from font(s) DejaVu Sans.
plt.tight_layout()
/home/ashwinvel2000/TAQA/.venv/lib/python3.12/site-packages/IPython/core/pylabtools.py:170: UserWarning: Glyph 127919 (\N{DIRECT HIT}) missing from font(s) DejaVu Sans.
fig.canvas.print_figure(bytes_io, **kw)
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Battery-Voltage anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real power system failure scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT: Description: Battery voltage drops below operational threshold Physics: Battery voltage should be 12-14V, failure drops to 8-10V Severity: CRITICAL Expected Response: This anomaly should IMMEDIATELY trigger alerts ==================================================================================================== š ANOMALY TYPE: CHOKE VALVE STUCK Severity: HIGH | Physics: Choke should vary 0-100%, stuck shows flat line Affected Sensor: Choke-Position LSTM Target: sensor_failure (tests LSTM's ability to detect sensor_failure) ====================================================================================================
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Choke-Position anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real choke valve stuck scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT: Description: Choke position becomes unresponsive/stuck Physics: Choke should vary 0-100%, stuck shows flat line Severity: HIGH Expected Response: This anomaly should PROMPTLY trigger alerts ==================================================================================================== š ANOMALY TYPE: PRESSURE SURGE/KICK Severity: CRITICAL | Physics: Normal 100-1000 psi, surge can reach 2000+ psi Affected Sensor: Upstream-Pressure LSTM Target: sensor_spike (tests LSTM's ability to detect sensor_spike) ====================================================================================================
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Upstream-Pressure anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real pressure surge/kick scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT: Description: Sudden upstream pressure increase indicating formation fluid influx Physics: Normal 100-1000 psi, surge can reach 2000+ psi Severity: CRITICAL Expected Response: This anomaly should IMMEDIATELY trigger alerts ==================================================================================================== š ANOMALY TYPE: CIRCULATION LOSS Severity: HIGH | Physics: Pressure drops indicate fluid loss to formation Affected Sensor: Downstream-Pressure LSTM Target: sensor_drift (tests LSTM's ability to detect sensor_drift) ====================================================================================================
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Downstream-Pressure anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real circulation loss scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT: Description: Downstream pressure drops indicating lost circulation Physics: Pressure drops indicate fluid loss to formation Severity: HIGH Expected Response: This anomaly should PROMPTLY trigger alerts ==================================================================================================== š ANOMALY TYPE: THERMAL SYSTEM MALFUNCTION Severity: MEDIUM | Physics: Up/downstream temps should correlate, drift indicates sensor issues Affected Sensor: Upstream-Temperature LSTM Target: sensor_drift (tests LSTM's ability to detect sensor_drift) ====================================================================================================
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Upstream-Temperature anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real thermal system malfunction scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT: Description: Temperature readings become uncorrelated or drift Physics: Up/downstream temps should correlate, drift indicates sensor issues Severity: MEDIUM Expected Response: This anomaly should PROMPTLY trigger alerts ==================================================================================================== š ANOMALY TYPE: SENSOR CORRELATION BREAK Severity: HIGH | Physics: Up/downstream pressures should correlate, break indicates system failure Affected Sensor: Upstream-Pressure LSTM Target: correlation_break (tests LSTM's ability to detect correlation_break) ====================================================================================================
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Upstream-Pressure anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real sensor correlation break scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT: Description: Upstream/downstream pressure correlation breakdown Physics: Up/downstream pressures should correlate, break indicates system failure Severity: HIGH Expected Response: This anomaly should PROMPTLY trigger alerts ==================================================================================================== š ANOMALY TYPE: TEMPORAL PATTERN INVERSION Severity: CRITICAL | Physics: Temperature patterns reversed - physically impossible sequence Affected Sensor: Downstream-Temperature LSTM Target: temporal_inversion (tests LSTM's ability to detect temporal_inversion) ====================================================================================================
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Downstream-Temperature anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real temporal pattern inversion scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT: Description: Temperature trend reversal (impossible physics) Physics: Temperature patterns reversed - physically impossible sequence Severity: CRITICAL Expected Response: This anomaly should IMMEDIATELY trigger alerts ==================================================================================================== š ANOMALY TYPE: CASCADING SYSTEM FAILURE Severity: CRITICAL | Physics: Power failure causes cascading sensor malfunctions Affected Sensor: Battery-Voltage LSTM Target: multi_sensor_failure (tests LSTM's ability to detect multi_sensor_failure) ====================================================================================================
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Battery-Voltage anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real cascading system failure scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT: Description: Multiple sensors failing in sequence (propagating failure) Physics: Power failure causes cascading sensor malfunctions Severity: CRITICAL Expected Response: This anomaly should IMMEDIATELY trigger alerts ==================================================================================================== š ANOMALY TYPE: ABNORMAL OSCILLATION Severity: MEDIUM | Physics: Choke should be stable, oscillations indicate control system malfunction Affected Sensor: Choke-Position LSTM Target: oscillation (tests LSTM's ability to detect oscillation) ====================================================================================================
\nš EXPERT VALIDATION CHECKLIST: 1. ā Does the Choke-Position anomaly look realistic? 2. ā Are the values within expected drilling ranges? 3. ā Does the pattern match real abnormal oscillation scenarios? 4. ā Are other sensors responding appropriately? 5. ā Would this trigger alerts in real drilling operations?
\nš ļø DRILLING CONTEXT:
Description: Choke position shows abnormal high-frequency oscillations
Physics: Choke should be stable, oscillations indicate control system malfunction
Severity: MEDIUM
Expected Response: This anomaly should PROMPTLY trigger alerts
\n\nš EXPERT VALIDATION DASHBOARD COMPLETE!
================================================================================
ā
Created comprehensive validation interface for drilling expert
š Normal examples: 3 | Anomaly types: 9
šÆ All features shown in real drilling units
š Visual comparisons with normal baselines provided
\nš COMPLETE VALIDATION SUMMARY:
NORMAL BEHAVIOR:
⢠3 examples of typical drilling operations
\n ANOMALY TYPES (Complete LSTM Test Suite):
⢠Power System Failure: CRITICAL severity
Affects: Battery-Voltage | LSTM Target: sensor_failure
⢠Choke Valve Stuck: HIGH severity
Affects: Choke-Position | LSTM Target: sensor_failure
⢠Pressure Surge/Kick: CRITICAL severity
Affects: Upstream-Pressure | LSTM Target: sensor_spike
⢠Circulation Loss: HIGH severity
Affects: Downstream-Pressure | LSTM Target: sensor_drift
⢠Thermal System Malfunction: MEDIUM severity
Affects: Upstream-Temperature | LSTM Target: sensor_drift
⢠Sensor Correlation Break: HIGH severity
Affects: Upstream-Pressure | LSTM Target: correlation_break
⢠Temporal Pattern Inversion: CRITICAL severity
Affects: Downstream-Temperature | LSTM Target: temporal_inversion
⢠Cascading System Failure: CRITICAL severity
Affects: Battery-Voltage | LSTM Target: multi_sensor_failure
⢠Abnormal Oscillation: MEDIUM severity
Affects: Choke-Position | LSTM Target: oscillation
\nš§ LSTM DETECTION CAPABILITIES TESTED:
⢠sensor_failure: Power System Failure, Choke Valve Stuck
⢠sensor_spike: Pressure Surge/Kick
⢠sensor_drift: Circulation Loss, Thermal System Malfunction
⢠correlation_break: Sensor Correlation Break
⢠temporal_inversion: Temporal Pattern Inversion
⢠multi_sensor_failure: Cascading System Failure
⢠oscillation: Abnormal Oscillation
\nš READY FOR EXPERT REVIEW!
Expert can now validate each pattern with:
ā Real drilling units (PSI, Volts, °F, %)
ā All 9 sensor channels visible
ā Normal vs anomaly comparisons
ā Drilling physics context
ā LSTM detection target identification
ā Clear validation checklists